Computational study of spatiotemporal electrochemical oscillators in configurations with one and two electrode pairs
National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) “Computational Mechanics”
Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach
We investigate the performance of two machine learning algorithms in the
context of anti-spam filtering. The increasing volume of unsolicited bulk
e-mail (spam) has generated a need for reliable anti-spam filters. Filters of
this type have so far been based mostly on keyword patterns that are
constructed by hand and perform poorly. The Naive Bayesian classifier has
recently been suggested as an effective method for automatically constructing
anti-spam filters with superior performance. We thoroughly investigate the
performance of the Naive Bayesian filter on a publicly available corpus,
contributing towards standard benchmarks. At the same time, we compare the
performance of the Naive Bayesian filter to an alternative memory-based
learning approach, after introducing suitable cost-sensitive evaluation
measures. Both methods achieve very accurate spam filtering, clearly
outperforming the keyword-based filter of a widely used e-mail reader.
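The combination of a Naive Bayesian filter with a cost-sensitive decision can be sketched as follows. This is a minimal illustration, not the authors' implementation: the multinomial model with add-one smoothing, the toy training messages, and the names `train`, `spam_probability`, `is_spam` and `cost_ratio` are all assumptions of this sketch. The decision rule assumes that blocking a legitimate message is `cost_ratio` times as costly as letting a spam message through, so a message is flagged only when its spam probability exceeds `cost_ratio / (1 + cost_ratio)`.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each item is (tokens, label).
TRAIN = [
    (["cheap", "pills", "buy", "now"], "spam"),
    (["buy", "cheap", "watches"], "spam"),
    (["meeting", "agenda", "tomorrow"], "legit"),
    (["project", "meeting", "notes"], "legit"),
]

def train(messages):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "legit": Counter()}
    doc_counts = Counter()
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts["spam"]) | set(word_counts["legit"])
    return word_counts, doc_counts, vocab

def spam_probability(tokens, model):
    """Posterior P(spam | tokens) under a multinomial Naive Bayes model
    with add-one smoothing; tokens outside the vocabulary are ignored."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "legit"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for tok in tokens:
            if tok in vocab:
                smoothed = (word_counts[label][tok] + 1) / (total_words + len(vocab))
                score += math.log(smoothed)
        log_score[label] = score
    # Normalise in a numerically stable way.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["legit"])

def is_spam(tokens, model, cost_ratio=9.0):
    """Flag as spam only if the posterior exceeds cost_ratio / (1 + cost_ratio)."""
    threshold = cost_ratio / (1.0 + cost_ratio)
    return spam_probability(tokens, model) > threshold

model = train(TRAIN)
```

With this toy data, the same message can be flagged under `cost_ratio=1.0` (threshold 0.5) yet pass under `cost_ratio=9.0` (threshold 0.9), which is the intended effect of making the filter more conservative as the cost of losing legitimate mail grows.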
ARResT/AssignSubsets: a novel application for robust subclassification of chronic lymphocytic leukemia based on B cell receptor IG stereotypy.
Abstract
Motivation: An ever-increasing body of evidence supports the importance of B cell receptor immunoglobulin (BcR IG) sequence restriction, alias stereotypy, in chronic lymphocytic leukemia (CLL). This phenomenon accounts for ∼30% of studied cases, one in eight of which belong to major subsets, and extends beyond restricted sequence patterns to shared biologic and clinical characteristics and, generally, outcome. Thus, the robust assignment of new cases to major CLL subsets is a critical, and yet unmet, requirement.
Results: We introduce a novel application, ARResT/AssignSubsets, which enables the robust assignment of BcR IG sequences from CLL patients to major stereotyped subsets. ARResT/AssignSubsets uniquely combines expert immunogenetic sequence annotation from IMGT/V-QUEST with curation to safeguard quality, statistical modeling of sequence features from more than 7500 CLL patients, and results from multiple perspectives to allow for both objective and subjective assessment. We validated our approach on the learning set, and evaluated its real-world applicability on a new representative dataset comprising 459 sequences from a single institution.
Availability and implementation: ARResT/AssignSubsets is freely available on the web at http://bat.infspire.org/arrest/assignsubsets/
Supplementary information: Supplementary data are available at Bioinformatics online.
Short communication: Determination of lactoferrin in Feta cheese whey with reversed-phase high-performance liquid chromatography
Abstract
In the current paper, a method is introduced to determine lactoferrin in sweet whey using reversed-phase HPLC without any pretreatment of the samples or use of a separation technique. As a starting point, the most common HPLC protocols for acid whey, which included pretreatment of the whey along with a sodium dodecyl sulfate-PAGE step, were tested. By skipping the pretreatment and the separation steps while altering the gradient profile, different chromatograms were obtained that proved to be equally effective for determining lactoferrin. For this novel 1-step reversed-phase HPLC method, repeatability was very high over a wide range of concentrations (1.88% intraday to 5.89% interday). The limit of detection was 35.46 μg/mL [signal:noise ratio (S/N) = 3], whereas the limit of quantification was 50.86 μg/mL (S/N = 10). Omitting the pretreatment step degraded the column, reducing its lifetime (to approximately 2,000 samples). As a result, the lactoferrin elution time changed, but neither the accuracy nor the separation ability of the method was significantly influenced. We observed that this degradation could be easily avoided or delayed by centrifuging the samples to remove fat or by extensive cleaning of the column after every 5 samples.
Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events
In this paper we present a fresh look at the problem of summarizing evolving
events from multiple sources. After a discussion concerning the nature of
evolving events we introduce a distinction between linearly and non-linearly
evolving events. We then present a general methodology for the automatic
creation of summaries from evolving events. At its heart lie the notions of
Synchronic and Diachronic cross-document Relations (SDRs), whose aim is the
identification of similarities and differences between sources, from a
synchronical and diachronical perspective. SDRs do not connect documents or
textual elements found therein, but structures one might call messages.
Applying this methodology yields a set of messages and the relations (SDRs)
connecting them, that is, a graph which we call a grid. We will show how such a
grid can be considered as the starting point of a Natural Language Generation
System. The methodology is evaluated in two case-studies, one for linearly
evolving events (descriptions of football matches) and another one for
non-linearly evolving events (terrorist incidents involving hostages). In both
cases we evaluate the results produced by our computational systems.
Comment: 45 pages, 6 figures. To appear in the Journal of Intelligent
Information Systems
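The grid described in this abstract, messages connected by synchronic and diachronic relations, can be pictured as a small labelled graph. The sketch below is only an illustration under stated assumptions and is not taken from the paper: the `Message` and `Grid` classes, their fields, and the relation labels are hypothetical, and the rule used here (same time point and different sources means synchronic, different time points means diachronic) is one plausible reading of the abstract.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Message:
    """A structured statement extracted from a source (hypothetical fields)."""
    msg_type: str      # e.g. "goal"
    source: str        # which document or outlet reported it
    time: int          # time point the message refers to
    args: tuple = ()   # participants or other arguments

@dataclass
class Grid:
    """Messages plus labelled relations between them."""
    messages: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (kind, label, a, b)

    def add_message(self, message):
        self.messages.append(message)
        return message

    def relate(self, label, a, b):
        """Record a relation and classify it as synchronic or diachronic.

        Assumption of this sketch: synchronic relations link messages from
        different sources about the same time point; diachronic relations
        link messages about different time points.
        """
        if a.time == b.time and a.source != b.source:
            kind = "synchronic"
        elif a.time != b.time:
            kind = "diachronic"
        else:
            raise ValueError("messages share both source and time")
        self.relations.append((kind, label, a, b))
        return kind

grid = Grid()
m1 = grid.add_message(Message("goal", "source_A", 10, ("player_X",)))
m2 = grid.add_message(Message("goal", "source_B", 10, ("player_X",)))
m3 = grid.add_message(Message("goal", "source_A", 55, ("player_Y",)))
kind_12 = grid.relate("agreement", m1, m2)      # two sources, same time point
kind_13 = grid.relate("continuation", m1, m3)   # same source, later time point
```

A generation component could then walk `grid.relations` to decide which messages to verbalise and how to connect them.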
Summarization from Medical Documents: A Survey
Objective:
The aim of this paper is to survey recent work in medical document
summarization.
Background:
During the last decade, document summarization has received increasing attention
from the AI research community. More recently, it has also attracted the interest
of the medical research community, due to the enormous growth of information
that is available to the physicians and researchers in medicine, through the
large and growing number of published journals, conference proceedings, medical
sites and portals on the World Wide Web, electronic medical records, etc.
Methodology:
This survey first gives a general background on document summarization,
presenting the factors that summarization depends upon, discussing evaluation
issues and briefly describing the various types of summarization techniques. It
then examines the characteristics of the medical domain through the different
types of medical documents. Finally, it presents and discusses the
summarization techniques used so far in the medical domain, referring to the
corresponding systems and their characteristics.
Discussion and conclusions:
The paper thoroughly discusses the promising paths for future research in
medical document summarization. It mainly focuses on the issue of scaling to
large collections of documents in various languages and from different media,
on personalization issues, on portability to new sub-domains, and on the
integration of summarization technology in practical applications.
Comment: 21 pages, 4 tables
Cytogenetic complexity in chronic lymphocytic leukemia: definitions, associations and clinical impact
Recent evidence suggests that complex karyotype (CK) defined by the presence of ≥3 chromosomal aberrations (structural and/or numerical) identified by chromosome banding analysis (CBA) may be relevant for treatment decision-making in chronic lymphocytic leukemia (CLL). However, many challenges towards routine clinical application of CBA remain. In a retrospective study of 5290 patients with available CBA data, we explored both clinicobiological associations and the clinical impact of CK in CLL. We found that patients with ≥5 abnormalities, defined as high-CK, exhibit uniformly dismal clinical outcome, independently of clinical stage, TP53 aberrations (deletion of chromosome 17p and/or TP53 mutations, TP53abs) and the expression of somatically hypermutated (M-CLL) or unmutated (U-CLL) immunoglobulin heavy variable genes (IGHV). Thus, they contrasted with CK cases with 3 or 4 aberrations (low-CK and intermediate-CK, respectively), who followed aggressive disease courses only in the presence of TP53abs. At the other end of the spectrum, patients with CK and +12,+19 displayed an exceptionally indolent profile. Building upon CK, TP53abs and IGHV gene somatic hypermutation status, we propose a novel hierarchical model where patients with high-CK exhibit the worst prognosis, while M-CLL lacking CK or TP53abs as well as CK with +12,+19 show the longest overall survival. In conclusion, CK should not be axiomatically considered unfavorable in CLL, representing a heterogeneous group with variable clinical behavior. High-CK with ≥5 chromosomal aberrations emerges as prognostically adverse, independently of other biomarkers. Prospective clinical validation is warranted before finally incorporating high-CK in risk stratification of CLL.
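The karyotype categories named in this abstract can be written down as a small lookup, reading the thresholds as 3 aberrations (low-CK), 4 aberrations (intermediate-CK) and 5 or more (high-CK), with fewer than 3 meaning no complex karyotype. The function name `ck_category` and the label strings are hypothetical, and the sketch covers only the counting rule, not the full hierarchical model, whose intermediate tiers the abstract does not spell out.

```python
def ck_category(n_aberrations: int) -> str:
    """Map a count of chromosomal aberrations to a karyotype-complexity label.

    Thresholds follow the abstract's definitions; the label strings are
    illustrative choices made for this sketch.
    """
    if n_aberrations < 0:
        raise ValueError("aberration count cannot be negative")
    if n_aberrations < 3:
        return "no CK"
    if n_aberrations == 3:
        return "low-CK"
    if n_aberrations == 4:
        return "intermediate-CK"
    return "high-CK"  # 5 or more aberrations

# Labels for a few example counts.
labels = {n: ck_category(n) for n in (0, 2, 3, 4, 5, 7)}
```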
Different spectra of recurrent gene mutations in subsets of chronic lymphocytic leukemia harboring stereotyped B-cell receptors.
We report on markedly different frequencies of genetic lesions within subsets of chronic lymphocytic leukemia patients carrying mutated or unmutated stereotyped B-cell receptor immunoglobulins in the largest cohort (n=565) studied for this purpose. By combining data on recurrent gene mutations (BIRC3, MYD88, NOTCH1, SF3B1 and TP53) and cytogenetic aberrations, we reveal a subset-biased acquisition of gene mutations. More specifically, the frequency of NOTCH1 mutations was found to be enriched in subsets expressing unmutated immunoglobulin genes, i.e. #1, #6, #8 and #59 (22-34%), often in association with trisomy 12, and was significantly different (P<0.001) from the frequency observed in subset #2 (4%, aggressive disease, variable somatic hypermutation status) and subset #4 (1%, indolent disease, mutated immunoglobulin genes). Interestingly, subsets harboring a high frequency of NOTCH1 mutations were found to carry few (if any) SF3B1 mutations. This starkly contrasts with subsets #2 and #3 where, despite their immunogenetic differences, SF3B1 mutations occurred in 45% and 46% of cases, respectively. In addition, mutations within TP53, whilst enriched in subset #1 (16%), were rare in subsets #2 and #8 (both 2%), despite all being clinically aggressive. All subsets were negative for MYD88 mutations, whereas BIRC3 mutations were infrequent. Collectively, this striking bias and skewed distribution of mutations and cytogenetic aberrations within specific chronic lymphocytic leukemia subsets implies that the mechanisms underlying clinical aggressiveness are not uniform, but rather support the existence of distinct genetic pathways of clonal evolution governed by a particular stereotyped B-cell receptor selecting a certain molecular lesion(s).